Neural network pruning is a well-established compression technique for enabling deep learning models on resource-constrained devices. The pruned model is usually specialized to a specific hardware platform and training task (together defined as the deployment scenario). However, existing pruning approaches rely heavily on training data to trade off model size, efficiency, and accuracy, which makes them ineffective for federated learning (FL) over distributed and confidential datasets. Moreover, the memory- and compute-intensive pruning process of most existing approaches cannot be handled by most resource-limited FL devices. In this paper, we develop FedTiny, a novel distributed pruning framework for FL, to obtain specialized tiny models for memory- and compute-constrained participating devices with confidential local data. To alleviate biased pruning caused by unseen heterogeneous data across devices, FedTiny introduces an adaptive batch normalization (BN) selection module that adaptively obtains an initially pruned model fitting the deployment scenario. To further improve on the initial pruning, FedTiny develops a lightweight progressive pruning module for finer local pruning under tight memory and computational budgets, in which the pruning policy for each layer is determined gradually rather than by evaluating the overall deep model structure. Extensive experimental results demonstrate the effectiveness of FedTiny, which outperforms state-of-the-art baseline approaches, especially when compressing deep models into extremely sparse tiny models.
Translated by Google Translate
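Face animation, one of the hottest topics in computer vision, has achieved promising performance with the help of generative models. However, generating identity-preserving and photo-realistic images remains a critical challenge because of sophisticated motion deformation and complex facial detail modeling. To address these problems, we propose a Face Neural Volume Rendering (FNEVR) network that fully explores the potential of 2D motion warping and 3D volume rendering in a unified framework. In FNEVR, we design a 3D Face Volume Rendering (FVR) module to enhance the facial details for image rendering. Specifically, we first extract 3D information with a well-designed architecture and then introduce an orthogonal adaptive ray-sampling module for efficient rendering. We also design a lightweight pose editor that enables FNEVR to edit the facial pose in a simple yet effective way. Extensive experiments show that FNEVR obtains the best overall quality and performance on widely used talking-head benchmarks.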
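Almost all multi-agent reinforcement learning algorithms without communication follow the principle of centralized training with decentralized execution. During centralized training, agents can be guided by the same signals, such as the global state. During decentralized execution, however, agents lack a shared signal. Inspired by viewpoint invariance and contrastive learning, we propose consensus learning for cooperative multi-agent reinforcement learning in this paper. Although they rely only on local observations, different agents can infer the same consensus in a discrete space. During decentralized execution, we feed the inferred consensus to the agents' networks as an explicit input, thereby fostering cooperation among them. Our proposed method can be extended to various multi-agent reinforcement learning algorithms with small model changes. Moreover, we evaluate it on several fully cooperative tasks and obtain convincing results.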
Recently, model-based agents have achieved better performance than model-free ones under the same computational budget and training time in single-agent environments. However, the complexity of multi-agent systems makes it difficult to learn a model of the environment, and the resulting compounding error can hinder learning when model-based methods are applied to multi-agent tasks. This paper proposes an implicit model-based multi-agent reinforcement learning method built on value decomposition methods. Under this method, agents interact with the learned virtual environment and evaluate the current state value according to imagined future states in the latent space, which gives them foresight. Our approach can be applied to any multi-agent value decomposition method. The experimental results show that our method improves sample efficiency in different partially observable Markov decision process domains.
Recently, hierarchical reinforcement learning methods have solved some challenging tasks in multi-agent systems. Inspired by the intra-level and inter-level coordination in the human nervous system, we propose HAVEN, a novel value decomposition framework based on hierarchical reinforcement learning for fully cooperative multi-agent problems. To address the instability arising from the concurrent optimization of policies across levels and agents, we introduce a dual coordination mechanism of inter-level and inter-agent strategies by designing reward functions in a two-level hierarchy. HAVEN requires neither domain knowledge nor pre-training and can be applied to any value decomposition variant. Our method achieves desirable results on different decentralized partially observable Markov decision process domains and outperforms other popular multi-agent hierarchical reinforcement learning algorithms.
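As an illustration of the value decomposition idea this abstract builds on, the following is a minimal sketch of the simplest additive form, in which the joint action value is the sum of per-agent utilities. It is not the HAVEN method itself: the function names and the toy utility tables are hypothetical and chosen only for this example.

```python
from itertools import product


def joint_q(per_agent_q, joint_action):
    """Additive decomposition: Q_tot(a_1..a_n) = sum_i Q_i(a_i)."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))


def decentralized_greedy(per_agent_q):
    """Each agent picks its own argmax using only its local utilities."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)


def centralized_greedy(per_agent_q):
    """Brute-force argmax over all joint actions, for comparison."""
    action_spaces = [range(len(q)) for q in per_agent_q]
    return max(product(*action_spaces), key=lambda a: joint_q(per_agent_q, a))


# Hypothetical local utilities for two agents with three actions each.
per_agent_q = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.9]]
```

With an additive decomposition, `decentralized_greedy(per_agent_q)` and `centralized_greedy(per_agent_q)` select the same joint action, which is what lets agents act on local information at execution time.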
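As one of the solutions to decentralized partially observable Markov decision process (DEC-POMDP) problems, value decomposition methods have recently achieved significant results. However, most value decomposition methods require the fully observable state of the environment during training, which is infeasible in scenarios where only incomplete and noisy observations are available. We therefore propose a novel value decomposition framework, named State Inference for value DEcomposition (SIDE), which eliminates the need to know the global state by simultaneously solving the two problems of optimal control and state inference. SIDE can be extended to any value decomposition method to tackle partially observable problems. By comparing the performance of different algorithms on StarCraft II micromanagement tasks, we verify that, even without access to the state, SIDE can infer from past local observations the current state that benefits the reinforcement learning process, and that it even achieves results superior to many baselines in some complex scenarios.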
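Prior work on self-supervised pre-training focuses on the joint training scenario, in which massive unlabeled data are assumed to be given as input all at once, and only then is a learner trained. Unfortunately, this problem setting is often impractical, if not infeasible, because many real-world tasks rely on sequential learning, for example when data are decentralized or collected in a streaming fashion. In this paper, we conduct the first thorough and dedicated investigation of self-supervised pre-training with streaming data, aiming to shed light on model behavior under this overlooked setup. Specifically, we pre-train over 500 models on four categories of pre-training streaming data from ImageNet and DomainNet and evaluate them on three types of downstream tasks and 12 different downstream datasets. Our studies show that, somewhat beyond our expectations, with simple data replay or parameter regularization, sequential self-supervised pre-training turns out to be an efficient alternative to joint pre-training, as the performance of the former is mostly on par with that of the latter. Moreover, catastrophic forgetting, a common issue in sequential supervised learning, is greatly alleviated in sequential self-supervised learning (SSL), which is well supported by our comprehensive empirical analysis of the representations and of the sharpness of minima in the loss landscape. Our findings therefore suggest that, in practice, cumbersome joint training for SSL can largely be replaced by sequential learning, which in turn enables a much broader spectrum of potential application scenarios.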
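Bike-sharing systems (BSSs) have become increasingly popular around the globe and have attracted wide research interest. This paper studies the demand prediction problem in BSSs. Spatial and temporal features are critical for demand prediction in BSSs, but extracting spatiotemporal dynamics of demand is challenging. Another challenge is capturing the relations between spatiotemporal dynamics and external factors such as weather, day of week, and time of day. To address these challenges, we propose a multiple spatiotemporal fusion network named MSTF-Net. MSTF-Net consists of multiple spatiotemporal blocks: 3D convolutional network (3D-CNN) blocks, Eidetic 3D convolutional long short-term memory network (E3D-LSTM) blocks, and fully connected (FC) blocks. Specifically, the 3D-CNN blocks focus on extracting short-term spatiotemporal dependence within each fragment (i.e., closeness, period, and trend); the E3D-LSTM blocks further extract long-term spatiotemporal dependence across all fragments; and the FC blocks extract nonlinear correlations of external factors. Finally, the latent representations of the E3D-LSTM and FC blocks are fused to obtain the final prediction. On two real-world datasets, MSTF-Net is shown to outperform seven state-of-the-art models.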
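The closeness, period, and trend fragments mentioned in this abstract can be pictured as three slices of the demand history taken at different time scales. The sketch below shows one way such fragments might be assembled from a single demand series. It rests on assumptions not stated in the abstract: hourly time steps, a daily period of 24 steps, a weekly trend of 168 steps, and the fragment lengths shown. The function name `build_fragments` and its parameters are hypothetical.

```python
def build_fragments(series, t, len_c=3, len_p=2, len_q=1, day=24, week=168):
    """Return (closeness, period, trend) fragments for predicting series[t].

    Assumes hourly steps: `day` and `week` are the step counts of the
    daily and weekly cycles (illustrative defaults, not from the paper).
    """
    earliest = t - max(len_c, len_p * day, len_q * week)
    if earliest < 0 or t > len(series):
        raise ValueError("not enough history before index t")
    # Most recent steps immediately before t.
    closeness = [series[t - k] for k in range(len_c, 0, -1)]
    # Same hour on the previous len_p days.
    period = [series[t - k * day] for k in range(len_p, 0, -1)]
    # Same hour on the previous len_q weeks.
    trend = [series[t - k * week] for k in range(len_q, 0, -1)]
    return closeness, period, trend


# Toy series whose value equals its index, so fragment contents are easy to read.
series = list(range(400))
closeness, period, trend = build_fragments(series, t=200)
```

In a model of the kind described, each fragment would be a stack of spatial demand maps rather than scalars; the indexing logic is the same.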
With the growing demand for teams of robots that perform tasks collaboratively, the research community has become increasingly interested in collaborative simultaneous localization and mapping (SLAM). Unfortunately, existing datasets are limited in the scale and variation of their collaborative trajectories, even though generalization across the trajectories of different agents is crucial to the overall viability of collaborative tasks. To help align the research community's contributions with realistic multiagent coordinated SLAM problems, we propose S3E, a large-scale multimodal dataset captured by a fleet of unmanned ground vehicles along four designed collaborative trajectory paradigms. S3E consists of 7 outdoor and 5 indoor sequences that each exceed 200 seconds, comprising temporally synchronized and spatially calibrated high-frequency IMU, high-quality stereo camera, and 360 degree LiDAR data. Crucially, our effort exceeds previous attempts in dataset size, scene variability, and complexity. It has 4x as much average recording time as the pioneering EuRoC dataset. We also provide careful dataset analysis as well as baselines for collaborative SLAM and its single-agent counterparts. Data and more up-to-date details are available at https://github.com/PengYu-Team/S3E.
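Graph neural networks (GNNs) tend to suffer from high computation costs because of the exponentially increasing scale of graph data and number of model parameters, which restricts their utility in practical applications. To this end, some recent works focus on sparsifying GNNs with the lottery ticket hypothesis (LTH) to reduce inference costs while maintaining performance levels. However, LTH-based methods have two major drawbacks: 1) they require exhaustive and iterative training of dense models, which incurs an extremely large training computation cost, and 2) they trim only graph structures and model parameters but ignore the node feature dimension, where significant redundancy exists. To overcome these limitations, we propose a comprehensive graph gradual pruning framework termed CGP. This is achieved by designing a during-training graph pruning paradigm that dynamically prunes GNNs within one training process. Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs. Furthermore, we design a co-sparsifying strategy to comprehensively trim all three core elements of GNNs: graph structures, node features, and model parameters. Meanwhile, to refine the pruning operation, we introduce a regrowth process into the CGP framework to re-establish pruned but important connections. The proposed CGP is evaluated on a node classification task across 6 GNN architectures, including shallow models (GCN and GAT), shallow-but-deep-propagation models (SGC and APPNP), and deep models (GCNII and RESGCN), on a total of 14 real-world graph datasets, including large-scale graph datasets from the challenging Open Graph Benchmark. Experiments show that the proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.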
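To give a feel for what pruning gradually within a single training run can look like, here is a minimal sketch under stated assumptions. It uses a cubic sparsity ramp and plain magnitude-based masking on a flat list of weights; both are common generic choices and are assumptions of this example, not the schedule or criterion of CGP, which the abstract does not specify. The regrowth step is omitted, and the function names are hypothetical.

```python
def sparsity_at(step, total_steps, final_sparsity, initial_sparsity=0.0):
    """Cubic ramp from initial to final sparsity (an assumed schedule)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude `sparsity` fraction."""
    n_prune = int(round(sparsity * len(weights)))
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


# Toy parameters; in training, the mask would be recomputed as sparsity rises.
weights = [0.5, -0.1, 0.05, -0.9]
mask = magnitude_mask(weights, sparsity=0.5)
```

During training, one would call `sparsity_at` at each pruning step and re-apply `magnitude_mask` with the returned target, so the model becomes sparse progressively instead of being pruned once and retrained.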